feat: BYOE 0.8.0 - endpoints registry + openai-compat provider (REQ-142)#92
Merged
Conversation
Phase 1 of the Bring-Your-Own-Endpoint sprint. Adds a generic OpenAI-v1-compatible endpoint registry so users can register self-hosted vLLM, llama.cpp server, LM Studio, and TGI backends and pick between them.

- `src/specsmith/agent/endpoints.py`: `Endpoint` / `EndpointAuth` / `EndpointStore` / `EndpointHealth` dataclasses, `schema_version=1`, JSON persistence at `~/.specsmith/endpoints.json` (chmod 600), token resolution dispatch (none / bearer-inline / bearer-env / bearer-keyring), `/v1/models` health probe with TLS verify toggle.
- `src/specsmith/cli.py`: `specsmith endpoints` group with `add` / `list` / `remove` / `default` / `test` / `models` subcommands. Inline-token redaction in `--json` output, optional bearer-keyring storage with hidden-input prompt, `--purge-keyring` on remove, `--set-default` on add.
- `tests/test_endpoints_store.py` + `tests/test_endpoints_cli.py`: 38 new tests covering validation, round-trip, redaction, token resolution dispatch, and `/v1/models` health against an in-process fake server.
- `tests/fixtures/api_surface.json`: registered `endpoints` as a top-level command for REQ-140 stability.
- `docs/site/endpoints.md`: BYOE walkthrough, auth strategy table, security notes, CLI reference.

Validation: ruff lint clean, ruff format clean, mypy strict clean for the new module, pytest 66/66 passing across the new suites + the existing api-surface stability test.

Co-Authored-By: Oz <oz-agent@warp.dev>
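For reference, a minimal sketch of what the token-resolution dispatch described above could look like, assuming Python 3.10+; the function name `resolve_token`, the field names, and the `"specsmith"` keyring service name are illustrative assumptions, not the module's actual API:

```python
import os
from dataclasses import dataclass


@dataclass
class EndpointAuth:
    # One of the four modes named above: "none", "bearer-inline",
    # "bearer-env", or "bearer-keyring".
    mode: str
    # Inline token, environment-variable name, or keyring entry id,
    # depending on the mode.
    value: str = ""


def resolve_token(auth: EndpointAuth) -> str | None:
    """Return the bearer token for an endpoint, or None for unauthenticated."""
    if auth.mode == "none":
        return None
    if auth.mode == "bearer-inline":
        return auth.value
    if auth.mode == "bearer-env":
        # Fails loudly if the configured variable is missing.
        return os.environ[auth.value]
    if auth.mode == "bearer-keyring":
        import keyring  # third-party; only needed for this mode

        return keyring.get_password("specsmith", auth.value)
    raise ValueError(f"unknown auth mode: {auth.mode!r}")
```

Dispatching on a mode string keeps the secret material out of the JSON file for the env and keyring paths, which matches the redaction behavior the CLI advertises for inline tokens.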
Phase 2 of the Bring-Your-Own-Endpoint sprint. Wires the registry from PR-1 into the chat surface and the persistent serve loop.

- `src/specsmith/agent/chat_runner.py`: new `_run_openai_compat` driver streams from a registered `Endpoint` via raw stdlib HTTP / SSE (no openai SDK dependency). `run_chat()` takes an optional `endpoint_id`; when set, the BYOE store is consulted and the resolved endpoint short-circuits the auto-detect provider chain. Failure modes (unreachable, 401, missing default model) fall back gracefully.
- `src/specsmith/cli.py`: `specsmith chat --endpoint <id>` threads through to `run_chat`. `specsmith serve --endpoint <id>` resolves the endpoint at startup, derives provider+model, and exports `SPECSMITH_ACTIVE_ENDPOINT` for downstream consumers.
- `tests/test_chat_runner_openai_compat.py`: 4 new pytest cases against an in-process fake `/v1/chat/completions` SSE server. Covers happy-path streaming, missing default-model fallback, 401-on-bad-token fallback, and the `run_chat` entry point with `endpoint_id` resolution.

Validation: ruff lint + format clean, 82/82 passing across the new + existing endpoint and warp parity suites.

Co-Authored-By: Oz <oz-agent@warp.dev>
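As an illustration, here is a self-contained sketch of streaming from an OpenAI-v1-compatible `/v1/chat/completions` endpoint over raw stdlib HTTP / SSE. The name `stream_chat` and the exact payload handling are assumptions, not the actual `_run_openai_compat` implementation:

```python
import json
import urllib.request


def stream_chat(base_url: str, model: str, prompt: str, token: str | None = None):
    """Yield content deltas from a /v1/chat/completions SSE stream."""
    body = json.dumps(
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        }
    ).encode()
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    req = urllib.request.Request(
        f"{base_url}/chat/completions", data=body, headers=headers
    )
    with urllib.request.urlopen(req) as resp:
        for raw_line in resp:
            line = raw_line.decode("utf-8").strip()
            if not line.startswith("data: "):
                continue  # skip keep-alives and blank separator lines
            payload = line[len("data: "):]
            if payload == "[DONE]":
                break
            delta = json.loads(payload)["choices"][0]["delta"].get("content")
            if delta:
                yield delta
```

Usage would look like `for chunk in stream_chat("http://10.0.0.4:8000/v1", "qwen2.5-coder", "hello"): print(chunk, end="")`, with `base_url` including the `/v1` prefix as in the registry examples below.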
Bump `pyproject.toml` to 0.8.0 to ship the Bring-Your-Own-Endpoint feature (REQ-142): the new endpoints store + `specsmith endpoints` CLI group (PR-1) and the openai-compat provider driver wired through `specsmith chat / serve --endpoint <id>` (PR-2).

Co-Authored-By: Oz <oz-agent@warp.dev>
What
Ships Bring-Your-Own-Endpoint (BYOE) support for OpenAI-v1-compatible LLM backends (vLLM, llama.cpp server, LM Studio, TGI, ...). Closes the user request to route Specsmith chat / serve through a self-hosted vLLM on the LAN.

Phases

PR-1 (endpoints registry):

- `src/specsmith/agent/endpoints.py`: `Endpoint` / `EndpointAuth` / `EndpointStore` / `EndpointHealth` dataclasses; `schema_version=1`; JSON persistence at `~/.specsmith/endpoints.json` with `chmod 600`; token resolution dispatch (`none` / `bearer-inline` / `bearer-env` / `bearer-keyring`); `/v1/models` health probe with TLS verify toggle (see the sketch after this list).
- `specsmith endpoints` group with `add` / `list` / `remove` / `default` / `test` / `models` subcommands. Inline-token redaction on `--json`, hidden-input prompt for the keyring path, `--purge-keyring` on remove.
- `docs/site/endpoints.md` walkthrough + `api_surface.json` registers `endpoints`.

PR-2 (openai-compat driver and the `--endpoint` flag):

- `_run_openai_compat` in `chat_runner.py` streams from the registered endpoint via raw stdlib HTTP / SSE (no openai SDK dependency). `run_chat` takes an optional `endpoint_id`; when set, the BYOE store is consulted and the resolved endpoint short-circuits the auto-detect provider chain. Failure modes (unreachable, 401, missing default model) fall back gracefully.
- `--endpoint <id>` flag on `specsmith chat` and `serve`. `serve` resolves the endpoint at startup, derives provider+model, and exports `SPECSMITH_ACTIVE_ENDPOINT`.
- New pytest cases against an in-process fake `/v1/chat/completions` SSE server.
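A minimal sketch of the `/v1/models` health probe with the TLS verify toggle; the name `probe_models` and the return shape are assumptions, and the real `EndpointHealth` plumbing is richer:

```python
import json
import ssl
import urllib.request


def probe_models(
    base_url: str, token: str | None = None, verify_tls: bool = True
) -> list[str]:
    """GET {base_url}/models and return the advertised model ids."""
    ctx = ssl.create_default_context()
    if not verify_tls:
        # The toggle exists for self-signed LAN deployments.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(f"{base_url}/models")
    if token is not None:
        req.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(req, context=ctx, timeout=5) as resp:
        data = json.load(resp)
    # OpenAI-v1 shape: {"object": "list", "data": [{"id": ...}, ...]}
    return [m["id"] for m in data.get("data", [])]
```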
Validation

- `ruff check` + `ruff format --check` clean for the new and modified files.
- `mypy` clean for `src/specsmith/agent/endpoints.py` (the strict-mode tier).
- `pytest tests/test_endpoints_store.py tests/test_endpoints_cli.py tests/test_chat_runner_openai_compat.py tests/test_warp_parity_followup.py tests/test_warp_parity.py` → 82 passing.
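For flavor, a minimal sketch of the in-process fake-server pattern the suites use; the real fixtures differ, and `FakeModels` plus the reuse of the hypothetical `probe_models` from the earlier sketch are illustrative only:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeModels(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/v1/models":
            body = json.dumps({"data": [{"id": "qwen2.5-coder"}]}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep test output quiet


def test_probe_models_happy_path():
    # Port 0 lets the OS pick a free port, so tests never collide.
    server = HTTPServer(("127.0.0.1", 0), FakeModels)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base_url = f"http://127.0.0.1:{server.server_port}/v1"
        # probe_models is the sketch shown earlier in this description.
        assert probe_models(base_url) == ["qwen2.5-coder"]
    finally:
        server.shutdown()
```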
How to test on your workstation

1. Install `specsmith` 0.8.0 (or run the dev branch in editable mode).
2. Register the endpoint: `specsmith endpoints add --id home-vllm --name "Home vLLM" --base-url http://10.0.0.4:8000/v1 --default-model qwen2.5-coder --auth none --set-default`.
3. Health-check it: `specsmith endpoints test home-vllm`.
4. Chat through it: `specsmith chat --endpoint home-vllm "hello"`; the response now streams from your vLLM, not Ollama / Anthropic / OpenAI.
Out of scope (PR-3 in the extension repo)

The extension-side `specsmith.endpoints` / `specsmith.testEndpoint` commands. Those land separately as a sibling PR (feat(extension): BYOE 0.8.0 - endpoints commands + bridge --endpoint plumbing (REQ-142), specsmith-vscode#46).

Co-Authored-By: Oz <oz-agent@warp.dev>